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Abstract. Complex dynamical networks appear in a wide range of physical, 
biological, and engineering systems. The coupling of subsystems with varying 
time scales often results in multirate behavior. During the simulation of highly 
integrated circuits, for example, only a few elements underlie changing signals 
whereas the major part—usually up to 80 or even 90 per cent—remains latent. 
Standard integration schemes discretize the entire circuit with a single step size 
which is mainly limited by the accuracy requirements of the rapidly changing 
subcircuits [3]. It is of a particular interest to speed up the simulation without 
a significant loss of accuracy. By exploiting the latency of the system, only a 
fraction of the equations has to be formulated and solved at a given time point. 

Gunther and Rentrop [4] suggest that multirate strategies must be based 
both on the numerical information of the integration scheme and on the topol¬ 
ogy of the circuit. In this paper, we will introduce a directed graph describing 
the interdependency of the underlying system and propose Runge-Kutta meth¬ 
ods which utilize the signal flow of the system in order to identify and exploit 
inactive regions. Furthermore, we describe an extension of these methods to 
identify and exploit periodic subsystems. 


1. Introduction. In this paper, we will consider initial value problems 

x(t) = f(t,x(t)), 
x(t 0 )=x 0 , 

with t £ I C ffi. and / :IxD-> R n , D C R". A fundamental class of numerical 
solvers are one-step methods of the form 

x m+1 =x m + h$(t m ,x m ,h), (2) 

where $ is referred to as the increment function. Important examples of one-step 
methods are Runge-Kutta methods. The increment function of a general s-stage 
Runge-Kutta method is given by 

S 

(3a) 

<7=1 
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where 


k q = f(t m + c q h, x m + h a qr k r ). 


(3b) 


The coefficients a qr , b q , and c q are often arranged in form of the so-called Butcher 
tableau 


( 4 ) 
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If the matrix A is strictly lower triangular, then the Runge-Kutta method is called 
explicit. Otherwise, the method is said to be implicit. 

2. Time-driven ordinary differential equations. Without loss of generality, 
the ordinary differential equation (1) can be rewritten as 

( 5 ) 

with external variables Xe £ and internal variables Xi G K n/ . That is, we 
split the system into two subsystems and introduce additional variables which can 
be explicitly written as a function of the time t. The dimension of the input vector 
Xe depends on the number of different time-dependent terms, the dimension of 
the internal vector xj is equal to the number of equations of the original system. 
We introduce this partitioning to measure the influence of the input signals on the 
internal variables and to generate a model of the signal flow. 

From now on, for the sake of simplicity, we will write the system—to which we 
will refer as a time-driven ordinary differential equation —as 


Xe 

Xi 


= f(t, x), with x = 


xe 

xi 


and / = 


/e 

fi 


( 6 ) 


Thus, xE,i = Xi and Xj^ = x nE+ i. Let n = ue + ni denote the size of the whole 
system again. 

For a time-driven ordinary differential equation, a one-step method is of the form 


r m +1 

l e 

^ ra +1 


L E 


Ax’) 

Ax’) 


with 


Ax£ = f E (t m+1 ) - f E (t m ), 


Ax™ = h<f>{t m ,x m ,h). 

The increment function of a Runge-Kutta method can now be rewritten as 

S 

9=1 


( 7 ) 


( 8 ) 


(9a) 


where 


k% = + c q h), 

S 

kl = fi(k q E,x? + h^aqrkj). 


r—1 


(9b) 
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3. Dependency graph. Given a time-driven ordinary differential equation, we 
want to analyze how changes of the input variables x e affect the internal variables 
Xi and how the signals propagate through the system. To this end, we derive a 
directed graph which represents the structure of the system. 

Define (n) = {1, ..., n} to be the set of indices. Since in general the functions fi, 
i £ (n), do not depend on all variables Xj, j £ (n), we introduce input and output 
sets for each variable to describe the dependency on other variables. 


Definition 3.1 (Input and output sets). Define the input set of a i £ (n), to be 

dfi 


• Xi = < X 


dxj 


t 0, j £ (n) }. 


Analogously, define the output set to be 

9fj 


Xi • = < x 


^°,j£(nU. 


( 10 ) 


( 11 ) 


That is, the variable Xi depends on Xj if the value of Xj is required for the 
evaluation of fi. The input and output sets induce a directed graph with the 
vertices being the variables and the edges being the dependency relations between 
the variables. 


Definition 3.2 (Dependency graph). For a given time-driven ordinary differential 
equation, define the dependency graph by £ d ), with = {Oi,..., 0„} 

and <E d = {(tq,^) | a£ mx j, i,j £ (n)}. 

If it is clear which differential equation is meant, we will simply write & d . The 
dependency graph of large-scale dynamical networks can be very sparse since the 
subsystems are often strongly coupled inside but only connected to a few other 
subsystems of the network. 


Example 1. 

1. Consider the linear differential equation 


"x\t) = x(t) + x(t), 


which is equivalent to the first-order system 


±l{t) 


'0 10 0' 


X\ (t) 

X 2 (t) 


0 0 10 


x 2 (t) 

x 3 (t.) 


0 0 0 1 


x 3 (t) 

Xi {t) 


0 10 1 


Xi (t)_ 


S. y ^ 

A 


The input and output sets are 

• xi = {a; 2 }, X\ • = 0, 

• a; 2 = {* 3 }, cc 2 • = {aai, ac 4 }, 

• X 3 = {X4:}, x 3 » = {x 2 }i 

•X 4 = {x 2 ,X 4 }, z 4 * = {x 3 ,X A }. 


The differential equation is an equation of order three in x(t). This can also 
be seen in the dependency graph, which is shown in Figure 1, since aq depends 
only on x 2 and can be obtained by integration. Moreover, the transposed 
system matrix A T is the adjacency matrix of he. ® d = &(A T ). 
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Figure 1. Dependency graph ©d of the linear system. 


2. Given the inverter chain of length N shown in Figure 2, the corresponding 
circuit equations can be written as a time-driven ordinary differential equation 
with 




0 

V dd 

Va{t) 

g(y i,v 2 ,v 3 ,vi) 

g{v i,v 2 ,v 4 ,v 5 ) 


g(v i,v 2 ,v N+2 ,v N+3 ) 


Here, Ue = 3 and n/ = N. The function g consists of the characteristic 
equations of the modules connected to the individual nodes and can be written 
as 


g(vi,V 2 ,V i - 1 ,V i ) = {ids, n{Vi, Vi-1, + Ids, p( V i, Vi-l, v 2 ))- 

We use the Shichman-Hodges model [6] to describe the drain-source current 
ids °f the pMOS and nMOS transistors. 

Vdd Vdd Vdd 




Figure 2. Inverter chain of length N. 


Although the ground voltage and the positive supply voltage Vdd are con¬ 
stant over time, we introduce additional variables since this assignment leads 
to a natural correlation between the nodes and the vertices tq. In addition, 
it allows for a straightforward graph-based approach to generate the system of 
equations and the dependency graph. The Jacobian ^ exhibits the following 
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structure 


* * * 

* * * 

* * * 
where empty places denote partial derivatives identical to zero. Figure 3 shows 
the dependency graph of the inverter chain. Since the constant voltages Vi and 
V 2 have no influence on the dynamic signal flow, the corresponding vertices 
and associated edges have been omitted due to visualization reasons. 


df 

dv 



Figure 3. Dependency graph 0^ of the inverter chain. 


In the following, we often identify Xi with tq. Each internal vertex of the de¬ 
pendency graph represents a one-dimensional ordinary differential equation that is 
coupled to other one-dimensional systems. Generally speaking, a time-driven ordi¬ 
nary differential equation together with its dependency graph can be regarded as a 
coupled cell system [1, 2] with additional time-dependent inputs. 

4. Signal-flow based Runge Kutta methods. During the simulation of big 
and loosely coupled networks, different subsystems often exhibit different rates of 
activity. That is, the values in some parts of the network change rapidly, while in 
other parts the values change very slowly or do not change at all. The active regions 
usually vary over time so that a previously inactive region undergoes quick changes 
and vice versa. 

Consider for example the inverter chain. If we apply an input signal, then, 
generally speaking, this input signal is reversed repeatedly with a small time delay 
so that it seems to flow continuously through the circuit. The step size control of 
standard integration schemes depends mainly on the fastest changing variables. As 
a result, even the inactive signals have to be recomputed at every time step unless 
multirate integration schemes or other techniques to exploit the latency are used. 
We will propose an integration scheme which utilizes the underlying structure of 
the system. 

With the definitions in Section 3, it is possible to determine which values of x m 
are necessary to compute the new values of x m+1 , namely, for the update of x ™, all 
values of the variables of the input set • x, t are required. Since the external variables 
% E , i , i £ (tie), depend only on the time t, the input sets are empty, i.e. »Xe, i = 0. 
The update of the internal values au,i, i £ (nj), requires the evaluation of fu. and 
thus the values of »xi^. To identify latent regions, we have to distinguish between 
the different vertex types. 

Definition 4.1 (Semi-latency). Let t m be the current time point and t m ~ l the 
previous time point. 
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1. An external variable Xe,i, i £ (ng), is said to be semi-latent at t m if 

+ c q h ) = /g,i(t m_1 + c q h ) (12) 

for all q = 1,..., s. 

2. An internal variable xj^, i G (ni ), is defined to be semi-latent if 

$ i {t m - 1 ,x m - 1 ,h)= 0. (13) 

The definition implies that xf i = xff 1 for all semi-latent internal variables. 
Whether a vertex is semi-latent at a specific time point is not known until all the 
values have been evaluated, but since our aim is to reduce the number of function 
evaluations, we want to mark vertices which need not be recomputed. Therefore, 
we introduce an additional concept. 

Definition 4.2 (Latency). A variable Xi, i G (n), is called latent of order 1 if Xi 
and all variables of the set *x, are semi-latent. Additionally, a latent variable Xi 
is defined to be latent of order v if all variables in • x, are at least latent of order 
v- 1. 


Let e be a user-defined error tolerance. For numerical computations, the semi¬ 
latency conditions are replaced by lAx^” 1 ! < £ and | Ax 7 f l ~ 1 \ < e, respectively. In 
order to illustrate the different states of activity, we simulate the inverter chain. 

Example 2. If the inverter chain is excited with a given input signal, then this 
signal flows—reversed at each inverter —through the circuit, as described above. 
Figure 4 shows the voltages and activity states resulting when the circuit is excited 
with the displayed piecewise linear function. With a view to a better visualization, 
the respective activity states of the vertices are slightly shifted upward. Clearly, only 
a few vertices are active at each time point and these active regions flow through 
the dependency graph. 

The example shows that the vertices are latent during the major part of the 
simulation, but each vertex at a different time. Below, we will propose modified 
Runge-Kutta methods for time-driven ordinary differential equations which take 
into account the dependency graph and the signal flow of the underlying system. 
The aim is to reduce the number of function evaluations without a huge loss of 
accuracy by exploiting the inherent latency. Since for some applications the function 
evaluations are time-consuming, whereas the update of the dependency graph can 
be accomplished in linear time, this approach offers the possibility to conceivably 
speed up the simulation. 

4.1. Explicit Runge Kutta methods. For the computation of the vectors kf, 
and kj, q = 1, ...,s, in (9), it is necessary to evaluate the functions /g and //, 
respectively. The functions fij, i G (ni ), have to be recomputed if only one of the 
variables of the input set • x/ j i is active or semi-latent. If x/^ is latent of a certain 
order, then we can reuse the previous value. 


Definition 4.3 (Signal-flow based Runge-Kutta method). Given a time-driven 
ordinary differential equation, a signal-flow based Runge-Kutta method is defined 

by 

*e +1 =*e+ A x%, 


X 


ra +1 

I,i 


L I,i ’ 
r m 
c i,i - 


Ax 


i,a 


if Xfi is latent of order s, 
otherwise, 


( 14 ) 
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Figure 4. Excitation of the inverter chain with a piecewise linear 
function, a) The dotted trajectories show the input function and 
the voltages at intermediate vertices, the thin horizontal lines in 
the corresponding color the activity state. Here, 0 denotes active, 
1 semi-latent, and 2 latent, respectively, b) Structure of and 
i'i at time 1, 2, 3, and 4 for a threshold of 10 -4 . c) Activity states 
at time 1, 2, 3, and 4, where red vertices represent active, yellow 
vertices semi-latent, and green vertices latent regions. 
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for all i £ (nfl). Here, s is again the number of stages. The vectors Axff and Ax™ 
are as defined in (8). 

Provided that we use exact computation, the following theorem holds. 

Theorem 4.4. The explicit Runge-Kutta methods and the corresponding signal- 
flow based methods are equivalent. 


Proof. In the proof, we add the superscript to or m — 1 to the stages to differentiate 
between the different time points. Let Xz,i be latent at t™ } i.e. x m_1 , h) = 0 

and 


fE,j {t™ + Cqh ) = fE,j{t m 1 + Cqh ) 


t m ~ l 
j.m —1 rjm— 1 


"'TP. A - ft- T- 


x E,j ~ ™E,j y X E,j £ 

Qflt™- 1 , x™~\ h) = 0 =► Xf d = X ™- 1 Vs/J £ • Xi ti . 

For q = 1, we have ci = 0 and thus 

,m,l __ r / m m\ _ r ( m—l m- 1\ _ .m-1,1 
"'J,i — JI,i\ x Ei x I ) — JI,i\ x E i x I I ~ ™I,i 

since depends only on the values of the input set • Xi^ and these values are the 
same as in the previous time step by definition. Now, assume that x/y is latent of 
order 2, i.e. all inputs of x/y are at least latent of order 1. If follows that 


7 ?7i,2 £ /im,2 rn , l 7 m,l\ 

kjf = Ji,i{k E ’ ,Xj + ha 21 kj’ ) 

l,m— 1,2 m—l 
, X T 


r m— 1,2 


£ /7 m— i.z rn—L , 7 7 m— i,i\ 7 m- 

= Ji,i{k E ,Xj +ha 2 \k I ’) = k Ii 

using the same reasoning again. Furthermore, by induction it can be shown that 

K? = fiA^,*? + hY,a qr k?' T ) 


- fr ~™—l 

— j i,i \^e > x 1 


q-l 

n k m ~ 1 ’ r 

lb 7 Lbqf lb J 

r =1 


) = kT, 


m—l,q 


if xi t i is latent of order q and 


T/ 1 =xT !i + h$ i {t m ,x m ,h) 

= x? tl + hY j b q kTf 

9=1 

s 

= x ™” 1 + h ^ b q k r pfl t,q 

9=1 

= x™- 1 + h$flt m -\x m -\h) = xZ 


if xi t i is latent of order s. 


□ 


For numerical computations, we do not update a variable if it is latent of order 
at least one assuming that the influence of longer paths is negligibly small. In 
the following, we will abbreviate the standard classical fourth-order Runge-Kutta 
method as RK and the corresponding signal-flow based method as sfRK. 

Example 3. Consider once again the inverter chain, which is a popular benchmark 
problem for multirate integration schemes. To analyze the efficiency of the signal- 
flow based standard Runge-Kutta method, we simulate the inverter chain of length 
N = 100 with variably time-consuming function evaluations and different rates of 
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inherent latency. To vary the amount of latency, we apply periodic input functions 
with different delays between two adjacent pulse signals, as shown in Figure 5. The 
complexity of the transistor model is increased by artificially adding terms which 
do not affect the solution of the system. 



Figure 5. Piecewise linear input function with varying delay AT 
to emulate latency. 

The runtimes of the simulation with both the standard Runge-Kutta method 
and the corresponding signal-flow based method for varying model complexities 
and input functions are shown in Figure 6. Here, the time interval is I = [0,40], 
the step size h — Ag, and the latency parameter e = 10 -6 . While the runtime 
of RK does not depend on the inherent latency, the runtime of sfRK decreases 
with increasing latency. Furthermore, the more complex the transistor model, the 
bigger the speedup of the signal-flow based integration scheme due to the reduced 
number of function evaluations. Table 1 contains the number of transistor model 
evaluations for different values of AT. The influence of e on the speedup of sfRK 
and the average difference per step between RK and sfRK for a fixed delay AT = 10 
are shown in Figure 7. 

Table 1. Number of transistor model evaluations of RK and sfRK. 


AT 

0 

5 

10 

15 

20 

RK 

3200000 

3200000 

3200000 

3200000 

3200000 

sfRK 

2317152 

1046664 

649976 

479360 

413024 


We can reduce the number of function evaluations even for AT = 0 since at the 
beginning of the simulation the circuit is in a steady state and it takes a short time 
until the input signal reaches the last inverter. During that time, parts of the circuit 
are inactive and need not be evaluated. 

Note that the deviation does not depend on the complexity since only artificial 
terms were introduced to model different complexities of the transistor model. 

4.2. Implicit Runge-Kutta methods. The stages of implicit Runge-Kutta meth¬ 
ods cannot be evaluated successively. At each time point, a system of nonlinear 
equations has to be solved. To solve these systems with the Newton-Raphson 
method, the Jacobian has to be computed. For the transient analysis of 
integrated circuits, this can be accomplished efficiently using so-called element 
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RK 


sfRK 



complexity 0 0 latency complexity 0 0 latency 

RK vs. sfRK 



complexity 


latency 


20 


Figure 6. Influence of the complexity and latency on the runtime 
of RK and sfRK. 


Speedup 


Deviation 




Figure 7. Speedup and deviation of sfRK as a function of e. 


stamps [3]. Every time the right-hand side // is evaluated, the Jacobian — 
if needed—is generated simultaneously. 

However, only the nonlinear equations that correspond to active regions will be 
solved assuming that the influence of and on the latent regions is negligibly small. 
Furthermore, it is then only necessary to compute and factorize the fraction of the 
Jacobian which represents the active part. That is, we can exploit the latency also 
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on the level of the nonlinear and linear systems of equations. In our implementation, 
a variable is not updated if it is at least latent of order one, the influence of longer 
paths is neglected again. 

In the following, we will consider in particular the trapezoidal rule, which is 
frequently used for the simulation of integrated circuits. Since the second version 
of Spice most circuit simulators apply either the trapezoidal rule or BDF schemes 
to solve the circuit equations [3]. We will denote the trapezoidal rule abbreviatory 
as TR and the signal-flow based trapezoidal rule as sfTR. 

The increment function of the trapezoidal rule tailored to time-driven ordinary 
differential equations can be written as 

= \ (fi(x%,x?) + f I (x™ +1 ,x? +1 )) . (15) 

That is, at each time step a system of nonlinear equations 

F(z) := z -xT-^ (//(*£, x?) + fi(x™ + \z)) = 0 (16) 

has to be solved. Using the Newton-Raphson method, this leads to the iteration 

Zk+ 1 = Zk + Az k , (17) 

where A z k is the solution of the linear system of equations 

(/ - A** = -z k + x? + ^ (fi(x%,x?) + fi(x™ +1 ,z k )) . (18) 

As a starting point for the iteration, we use zq = x™. 

Example 4. To facilitate comparisons of the explicit Runge-Kutta method and the 
implicit trapezoidal rule, we repeat the simulation of the inverter chain of length 
N = 100 with the settings described in Example 3. Figure 8 shows the runtimes 
of the simulation with both the standard trapezoidal rule and the signal-flow based 
trapezoidal rule for varying model complexities and input functions again. We use 
the Newton-Raphson method to solve the nonlinear systems and the LU factor¬ 
ization to solve the resulting linear systems of equations. For the signal-flow based 
simulation, only the active and senri-latent parts of the nonlinear and linear systems 
of equations are generated and solved. Here, the influence of the model complexity 
is negligible since the runtime of the LU factorizations is dominating. Table 2 con¬ 
tains the number of required transistor model evaluations. The influence of e on 
the speedup of sfTR and the average deviation per step for a fixed delay AT = 10 
are shown in Figure 9. 

If the delay AT of the input function is bigger than 12 or the period is bigger 
than 14, respectively, then the trapezoidal rule depends on the latency. This is 
due to the fact that the signal needs approximately this period of time to pass all 
inverters. For larger values of AT, there is a small time interval where all vertices 
are latent and thus the Newton-Raphson method needs less iterations to converge. 

Table 2. Number of transistor model evaluations of TR and sfTR. 


AT 

0 

5 

10 

15 

20 

TR 

sfTR 

2353600 

1736618 

2353600 

784214 

2353600 

486788 

2075200 

357118 

1881600 

307582 
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TR sfTR 



complexity 0 0 | ate ncy complexity 0 0 | atency 


TR vs. sfTR 



complexity 0 0 l a tency 


Figure 8. Influence of the complexity and latency on the runtime 
of TR and sfTR . 


Speedup 


Deviation 




Figure 9. Speedup and deviation of sfTR as a function of e. 


5. Generalization to periodic systems. In power electronic circuits, diodes and 
semiconductor switches are constantly changing their status and a steady state 
condition is by definition reached when the waveforms are periodic with a time 
period T which depends on the specific nature of the circuit [5] . The time scales of 
these circuits may differ by several orders of magnitude and the simulation requires 
very small step sizes to cover the dynamics of the fastest subsystems. The maximum 
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simulation time, on the other hand, is usually determined by the slowest subsystems. 
Thus, a detailed simulation of power electronic circuits is in general very time- 
consuming. Now, we want to extend the signal-flow based approach to identify and 
exploit not the latency but the periodicity of subsystems in order to reduce the 
runtime of the simulation. 

Definition 5.1 (Semi-periodicity). Let T be the fundamental period of the system 
and h = p £ N, the step size. 

1. An external variable xe,u i £ (ng), is said to be semi-periodic at t m if 

+ c q h) = + Cgh) (19) 

for all q = 1,..., s. 

2. An internal variable Xi^, i £ (nr), is defined to be semi-periodic if 

( 2 °) 

In contrast to the definition of semi-latency, the variables are not compared to 
the previous time step, but to the corresponding time step of the previous period. 
Roughly speaking, latency can be regarded as a special case of periodicity for which 
P= 1- 

Definition 5.2 (Periodicity). A variable Xi, i £ (n), is called periodic of order 1, 
if x, and all variables of the set • Xi are semi-periodic. Additionally, a periodic 
variable Xi is defined to be periodic of order v if all variables in • Xi are at least 
periodic of order v — 1. 

Let e be again a given error tolerance. For numerical computations, the semi¬ 
periodicity conditions are replaced by \xff — x r ff~ p \ < e and \xf l i — x r f'~ p \ < e, 
respectively. Analogously to the latency-based methods, we do not update a variable 
if it is periodic of order one or higher. To illustrate the different activity states, we 
use the inverter chain. 

Example 5. The inverter chain is excited with a piecewise linear function which is 
periodic with T = 4 for t > 1. The input function and the resulting node voltages 
at intermediate vertices are shown in Figure 10. 


Latency Periodicity 



Figure 10. Comparison of latency and periodicity. The curves 
show the node voltages V 3 , V 7 , and vn, the thin horizontal lines 
the corresponding states of the variables. Here, 0 denotes active, 1 
semi-latent or semi-periodic, and 2 latent or periodic, respectively. 
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Definition 5.3 (Signal-flow based periodic Runge-Kutta method). An explicit 
signal-flow based periodic Runge-Kutta method for a time-driven ordinary differen¬ 
tial equation is defined by 


,m+1 

_ rr-'m 

E 

~ X E 

,m+1 

-! x 

7,2 

- \ . 


1 X 


+ Aa$, 

m-p -\-1 
7,2 > 

m I A^ra 
7,2 ^ ^^ 7 , 2 ’ 


if rt'/y; is periodic of order s, 
otherwise, 


( 21 ) 


for i £ (ri/). 


To exploit the periodicity of subsystems and to reduce the number of function 
evaluations, we store the vectors x m ~ p+1 , x m ~ p+2 ,..., x m in a circular buffer. 


Theorem 5.4. The explicit Runge-Kutta methods and the corresponding signal- 
flow based methods for periodic systems are equivalent. 

Proof. The proof is almost identical to the proof of Theorem 4.4. We add again the 
superscript m or m — p to the stages to differentiate between the time points. Let 


Xl,i 

be periodic at t m , i.e. xfi ■ 

= 

and 


fE,j (t m + c q h) 

= fE,j(i 

1 m ~ P + Cqh) \/XEJ e •Xi'i 


X I,j 

m—p 

~ x i,j 

Vx/j £ •xj^. 

For 

q = 1, this yields 




k T/ = fl,i( X E 

,X?) = 

f T („ m -P rr m -P\ — um-p,l 
JI,l\X E ,Xj ) — 

and 

hence by induction 




, m ,9 _ r (, m,q 

xT + h 

9-1 

,, k m ’ r \ 

2_^ a qr^I ) 


r —1 

q -1 

= fi,i (fcr™, <~ P + h ]T a qr k?~ p ’ r ) = kT~ p ' q 

r —1 

for each variable x which is periodic of order q. Consequently, 

X™f l = x? ti + h^i(t m ,X m ,h) 

S 

= x?,i + hJ2 b q k Tf 

9=1 

=x-r+h±b q kT,r- 

9=1 

= x™~ p + h r ~ p , x m ~ p , h) = x™~ p+ \ 
for each xn which is periodic of order s. □ 

Now, let sfpRK denote the signal-flow based standard fourth-order Runge-Kutta 
method for periodic systems. 

Example 6. To compare the signal-flow based method for periodic systems with 
the standard Runge-Kutta method, we simulate the inverter chain as described in 
Example 3. The results are shown in Figure 11 and Table 3. Here, the number of 
function evaluations rises with increasing AT since the time interval in which the 
system is periodic according to our definition decreases. 
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Figure 11 . Influence of the complexity and latency on the runtime 
of RK and sfpRK. 


Table 3. Number of transistor model evaluations of RK and sfpRK. 


AT 

0 

5 

10 

15 

20 

RK 

3200000 

3200000 

3200000 

3200000 

3200000 

sfpRK 

422328 

700936 

999672 

1360800 

1760800 


6. Conclusion. The efficiency of the signal-flow based Runge-Kutta methods de¬ 
pends strongly on the characteristic properties of the system. The inverter chain 
example shows that if during the simulation large parts of the system are latent 
and function evaluations are comparatively time-consuming, then the signal-flow 
based methods result in a substantially reduced runtime while introducing only a 
small deviation compared to the corresponding standard Runge-Kutta methods. 
If, on the other hand, large parts are periodic with a fundamental period T, then 
the signal-flow based methods for periodic systems can be used to speed up the 
simulation. The following example summarizes these results. 

Example 7. Figure 12 shows a comparison of the signal-flow based standard 
Runge-Kutta method and the corresponding method for periodic systems. If T 
is small, then the periodicity-oriented Runge-Kutta method is more efficient since 
the circuit is active most of the time. With increasing T, the latency exploitation 
becomes more efficient. 
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Figure 12. Comparison of sfRK and sfpRK. 


7. Further extensions. To utilize not only the temporal latency, i.e. inactivity 
over a period of time, but also the spatial latency, i.e. inactivity during the Newton- 
Raphson iterations, the proposed techniques might be applicable as well. This could, 
for example, be used to speed up the DC analysis, exploiting the fact that some 
parts of the circuit possibly converge rapidly to a solution while other parts converge 
only very slowly. 
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